Adaptive processing and archiving of compound scanned documents

Authors

  • ROUMEN KOUNTCHEV
  • ROUMIANA KOUNTCHEVA
Abstract

This paper presents a new approach to the adaptive processing and compression of scanned document images that contain both text and pictures. To achieve high compression while retaining as much quality as possible, the document content is analyzed and two corresponding regions of interest are defined. Each region is then processed separately: the text with lossless compression and the pictures with lossy compression. The lossy compression is based on the Inverse Pyramid Decomposition, and the lossless compression uses an adaptive run-length coding method. The pictures in the restored document are visually lossless, while the text is unchanged. The overall compression efficiency surpasses that of the JPEG 2000 standard for the same images. The paper presents the basic principles of the Inverse Pyramid Decomposition and includes experimental results and a comparison with JPEG 2000.

Keywords: Image processing, Image segmentation, Image contents analysis, Lossless image compression, Histogram modification, Inverse pyramid decomposition, Lossy image compression.
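The abstract names run-length coding as the basis of the lossless text path but does not describe the adaptive variant. As a rough illustration only, the sketch below shows plain (non-adaptive) run-length coding of one row of a binarized text region. The function names `rle_encode` and `rle_decode` and the sample row are hypothetical and are not taken from the paper.

```python
from itertools import groupby


def rle_encode(row):
    """Encode a sequence of pixel values as (value, run_length) pairs.

    Plain run-length coding; the paper's adaptive scheme is not reproduced here.
    """
    return [(value, sum(1 for _ in group)) for value, group in groupby(row)]


def rle_decode(pairs):
    """Rebuild the original pixel sequence from (value, run_length) pairs."""
    row = []
    for value, length in pairs:
        row.extend([value] * length)
    return row


# Hypothetical row of a binarized text region (0 = background, 1 = ink).
sample_row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
encoded = rle_encode(sample_row)

# Lossless: decoding must return exactly the input row.
assert rle_decode(encoded) == sample_row
```

Long uniform runs, such as the white background between text lines, collapse to a single pair, which is why this family of methods suits text regions better than photographic ones.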


Similar resources

Block-based segmentation and adaptive coding for visually lossless compression of scanned documents

This paper presents a novel block-based segmentation and adaptive coding (BSAC) algorithm for visually lossless compression of scanned documents that contain not only photographic images but also text and graphic images. For such a compound image source, we structure the image into nonoverlapping blocks and classify each block into four different classes based on the empirical statistics within th...


Preprocessing and Lossless Compression of Visual Biomedical Information

This work presents some new approaches for efficient archiving of visual medical information of various kinds. The main focus is the archiving of scanned paper documents. For this, new algorithms for image preprocessing and object segmentation are presented. The preprocessing is based on adaptive filtration, used to reduce the noise in the image background (corresponding t...


Cooperative and Fast-Learning Information Extraction from Business Documents for Document Archiving

Automatic information extraction from scanned business documents is especially valuable in the application domain of document management and archiving. Although current solutions for document classification and extraction work pretty well, they still require a high effort of on-site configuration done by domain experts or administrators. Especially small office/home office (SOHO) users and priv...


Automatic indexing of scanned documents: a layout-based approach

Archiving official written documents such as invoices, reminders and account statements is becoming more and more important in both business and private settings. Creating appropriate index entries for document archives, such as sender's name, creation date or document number, is tedious manual work. We present a novel approach to handle automatic indexing of documents based on generic positional extraction of


Document Compression Using H.264/AVC

It has been verified that H.264/AVC, the newest video compression standard, can also be used to encode still images. In many cases, it outperforms state-of-the-art coders such as JPEG2000. For compound documents, the gains over JPEG2000 are even more expressive. In this scenario, the contributions of the present paper are distributed over four document encoding methods that use the H.264/AVC as a b...




Publication date: 2012